Large-Scale Computational Inference era
Representative figures of this era include Marcus, Peritz, and Gabriel, who proposed the closed testing framework in 1976 to control the family-wise error rate in multiple testing, and Sture Holm, whose 1979 sequentially rejective Bonferroni method gains power over the plain Bonferroni correction while preserving that error control. Ker-Chau Li introduced sliced inverse regression in 1991 as a dimension-reduction technique for regression with many predictors, making inference in high-dimensional settings more tractable. Benjamini and Hochberg introduced a procedure for controlling the false discovery rate in 1995, providing an error criterion that scales to large numbers of simultaneous tests. John Chambers and colleagues at Bell Labs developed the S language, a programmable statistical computing environment that bridged theory and computation and supported scripted, repeatable data analysis.
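The practical difference between Holm's family-wise control and the Benjamini-Hochberg false discovery rate criterion is easiest to see on a small set of p-values. The following is a minimal sketch in plain Python, not a reference implementation; the function names `holm_reject` and `bh_reject` and the example p-values are illustrative assumptions, not taken from the original papers.

```python
def holm_reject(pvalues, alpha=0.05):
    """Holm's step-down procedure (controls the family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first hypothesis that is not rejected
    return reject


def bh_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value falls under its threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:  # reject every hypothesis up to rank k
        reject[i] = True
    return reject


# Hypothetical p-values chosen only for illustration.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print("Holm rejections:", sum(holm_reject(pvals)))
print("BH rejections:  ", sum(bh_reject(pvals)))
```

On these illustrative values Holm rejects one hypothesis and Benjamini-Hochberg rejects two, reflecting the less stringent criterion that the false discovery rate imposes when many tests are run at once.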
Open-Source Statistical Ecosystem era
In the Open-Source Statistical Ecosystem era, Ross Ihaka and Robert Gentleman created the R language in the 1990s, and the community-driven CRAN package repository that grew around it became an anchor for reproducible statistics. Hadley Wickham shaped modern R practice by creating the tidyverse and key packages such as ggplot2, dplyr, and tidyr, which share a consistent data-frame-centered design and provide modular tooling for analysis. In Python, Wes McKinney's pandas introduced a cohesive DataFrame paradigm, built on Travis Oliphant's NumPy foundation, enabling scalable data analysis within a cross-language ecosystem. Fernando Pérez and Brian Granger advanced interactive, reproducible research with IPython and Jupyter notebooks, while Johannes Köster with Snakemake and Paolo Di Tommaso with Nextflow pushed cloud-ready workflows for scientific pipelines.